Online Planning


Adaptive Online Packing-guided Search for POMDPs

Neural Information Processing Systems

The partially observable Markov decision process (POMDP) provides a general framework for modeling an agent's decision process with state uncertainty, and online planning plays a pivotal role in solving it. A belief, a distribution over states, represents the agent's state uncertainty. Methods for large-scale POMDP problems rely on the same underlying idea of sampling both states and observations.
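
As a rough illustration of that shared sampling idea (not this paper's algorithm), the sketch below represents a belief as a set of state particles and updates it by sampling transitions and reweighting by observation likelihood. The dynamics and observation model here are invented toy placeholders.

```python
import random

def transition(state, action):
    # Toy stochastic dynamics (hypothetical): noisy random walk on integers.
    return state + action + random.choice([-1, 0, 1])

def observation_prob(obs, state):
    # Toy observation model (hypothetical): observe the state with +/-1 noise.
    return {0: 0.6, 1: 0.2, -1: 0.2}.get(obs - state, 0.0)

def update_belief(particles, action, obs, n_particles=100):
    """One sampling-based belief update: propagate each state particle through
    the transition model, then reweight and resample by observation likelihood."""
    propagated = [transition(s, action) for s in particles]
    weights = [observation_prob(obs, s) for s in propagated]
    total = sum(weights)
    if total == 0:  # observation inconsistent with every particle: keep prediction
        return propagated
    return random.choices(propagated, weights=weights, k=n_particles)

belief = [0] * 100                         # initial belief: all mass on state 0
belief = update_belief(belief, action=1, obs=2)
print(f"mean of belief after one update: {sum(belief) / len(belief):.2f}")
```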



Online Planning with Lookahead Policies

Neural Information Processing Systems

Real Time Dynamic Programming (RTDP) is an online algorithm based on Dynamic Programming (DP) that acts by 1-step greedy planning. Unlike DP, RTDP does not require access to the entire state space, i.e., it explicitly handles exploration. This makes RTDP particularly appealing when the state space is large and it is not possible to update all states simultaneously. In this work, we devise a multi-step greedy RTDP algorithm, which we call $h$-RTDP, that replaces the 1-step greedy policy with an $h$-step lookahead policy. We analyze $h$-RTDP in its exact form and establish that increasing the lookahead horizon, $h$, results in improved sample complexity, at the cost of additional computation. This is the first work that proves improved sample complexity as a result of increasing the lookahead horizon in online planning. We then analyze the performance of $h$-RTDP in three approximate settings: approximate model, approximate value updates, and approximate state representation. For these cases, we prove that the asymptotic performance of $h$-RTDP remains the same as that of a corresponding approximate DP algorithm, the best one can hope for without further assumptions on the approximation errors.
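
To make the $h$-step lookahead idea concrete, here is a minimal toy sketch (our illustration under simplifying assumptions, not the paper's exact $h$-RTDP or its analysis): on a small known MDP, the agent acts greedily with respect to an $h$-step lookahead and performs a Bellman backup only at the state it actually visits.

```python
import random

# Toy 3-state MDP (hypothetical): P[s][a] = [(prob, next_state), ...]; R[s][a] = reward.
P = [
    [[(1.0, 1)], [(0.5, 0), (0.5, 2)]],   # state 0
    [[(1.0, 2)], [(1.0, 0)]],             # state 1
    [[(1.0, 2)], [(1.0, 2)]],             # state 2 (absorbing, rewarding)
]
R = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]]

def lookahead(V, s, h, gamma=0.95):
    """Best h-step lookahead value and first action from state s."""
    if h == 0:
        return V[s], None
    best_q, best_a = float("-inf"), None
    for a in range(len(P[s])):
        q = R[s][a] + gamma * sum(p * lookahead(V, s2, h - 1, gamma)[0]
                                  for p, s2 in P[s][a])
        if q > best_q:
            best_q, best_a = q, a
    return best_q, best_a

def h_rtdp_episode(V, s, h, steps=20):
    """Act h-step greedily online; back up only the states actually visited."""
    for _ in range(steps):
        v, a = lookahead(V, s, h)
        V[s] = v                            # Bellman backup at the visited state
        probs, succs = zip(*P[s][a])
        s = random.choices(succs, weights=probs)[0]

V = [0.0, 0.0, 0.0]
h_rtdp_episode(V, s=0, h=2)
print(V)
```

The trade-off the abstract describes is visible in the sketch: a larger $h$ makes each decision more expensive (the lookahead tree grows exponentially in $h$) while requiring fewer online interactions.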



DyRo-MCTS: A Robust Monte Carlo Tree Search Approach to Dynamic Job Shop Scheduling

Chen, Ruiqi, Mei, Yi, Zhang, Fangfang, Zhang, Mengjie

arXiv.org Artificial Intelligence

Dynamic job shop scheduling, a fundamental combinatorial optimisation problem in various industrial sectors, poses substantial challenges for effective scheduling due to frequent disruptions caused by the arrival of new jobs. State-of-the-art methods employ machine learning to learn scheduling policies offline, enabling rapid responses to dynamic events. However, these offline policies are often imperfect, necessitating the use of planning techniques such as Monte Carlo Tree Search (MCTS) to improve performance at online decision time. The unpredictability of new job arrivals complicates online planning, as decisions based on incomplete problem information are vulnerable to disturbances. To address this issue, we propose the Dynamic Robust MCTS (DyRo-MCTS) approach, which integrates action robustness estimation into MCTS. DyRo-MCTS guides the production environment toward states that not only yield good scheduling outcomes but are also easily adaptable to future job arrivals. Extensive experiments show that DyRo-MCTS significantly improves the performance of offline-learned policies with negligible additional online planning time. Moreover, DyRo-MCTS consistently outperforms vanilla MCTS across various scheduling scenarios. Further analysis reveals that its ability to make robust scheduling decisions leads to long-term, sustainable performance gains under disturbances.
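
Integrating a robustness estimate into tree search could look roughly like the following sketch: standard UCT selection with an additive per-action robustness bonus. This is a hypothetical illustration; the bonus term, its weight `c_rob`, and the `Child` statistics are our assumptions, not DyRo-MCTS's actual formulation.

```python
import math
from dataclasses import dataclass

@dataclass
class Child:
    visits: int = 0
    value_sum: float = 0.0
    robustness: float = 0.0  # assumed in [0, 1]: tolerance of reached states to new job arrivals

def select_action(children, c_uct=1.4, c_rob=0.5):
    """UCT action selection with an additive robustness bonus.
    `children` maps each candidate scheduling action to its Child statistics."""
    log_n = math.log(sum(ch.visits for ch in children.values()) + 1)

    def score(ch):
        exploit = ch.value_sum / max(ch.visits, 1)
        explore = c_uct * math.sqrt(log_n / max(ch.visits, 1))
        return exploit + explore + c_rob * ch.robustness

    return max(children, key=lambda a: score(children[a]))

# Hypothetical usage: two candidate job dispatches with different robustness.
children = {"dispatch_j1": Child(10, 7.0, 0.9), "dispatch_j2": Child(12, 8.0, 0.2)}
print(select_action(children))  # favors the more robust dispatch here
```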



Online Planning for Cooperative Air-Ground Robot Systems with Unknown Fuel Requirements

Agarwal, Ritvik, Hatami, Behnoushsadat, Gautam, Alvika, Maini, Parikshit

arXiv.org Artificial Intelligence

We consider an online variant of the fuel-constrained UAV routing problem with a ground-based mobile refueling station (FCURP-MRS), in which targets incur unknown fuel costs. We develop a two-phase solution: an offline heuristic-based planner that computes initial paths for the UAV and the unmanned ground vehicle (UGV), and a novel online planning algorithm that dynamically adjusts rendezvous points based on real-time fuel consumption during target processing. Preliminary Gazebo simulations demonstrate the feasibility of our approach in maintaining UAV-UGV path validity and ensuring mission completion. Link to video: https://youtu.be/EmpVj-fjqNY
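
A minimal sketch of what such an online adjustment loop might look like (a hypothetical structure supplied for illustration, not the paper's algorithm; straight-line distances and a constant fuel burn rate are assumptions):

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def adjust_rendezvous(uav_pos, fuel_left, rendezvous, ugv_path, burn_rate=1.0):
    """If the UAV can no longer reach the planned rendezvous on its remaining
    fuel, pull the rendezvous back to the reachable point on the UGV path
    closest to the original plan."""
    if fuel_left >= burn_rate * dist(uav_pos, rendezvous):
        return rendezvous                                # plan still valid
    reachable = [p for p in ugv_path
                 if burn_rate * dist(uav_pos, p) <= fuel_left]
    if not reachable:
        raise RuntimeError("no reachable rendezvous: mission infeasible")
    return min(reachable, key=lambda p: dist(p, rendezvous))

# Hypothetical usage: only 5 fuel units remain, original rendezvous at (8, 0).
print(adjust_rendezvous(uav_pos=(0, 0), fuel_left=5.0,
                        rendezvous=(8, 0), ugv_path=[(2, 0), (4, 0), (8, 0)]))
# -> (4, 0): the closest reachable point to the original rendezvous
```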


Review for NeurIPS paper: Online Planning with Lookahead Policies

Neural Information Processing Systems

Additional Feedback: COMMENTS AFTER REBUTTAL Thank you for your response. However, in this paper's case I find that the significance of the paper (i.e., support for your claim that "theoretical results provided in this work are important on their own") is severely lacking without experiments showing a link between this theory and an algorithm's performance in terms of measures like running time, number of 1-step Bellman backups, etc. ***Note: this is not a claim that every theoretical paper needs experiments; it applies only to this specific work, due to the theory issues mentioned in the original review.*** The rebuttal's attempted arguments against providing experiments miss the mark: -- The rebuttal cites "Beyond the one step greedy approach in RL" as an example of a paper similar to this submission in its degree of theoretical focus, but that paper actually has experiments! Light experiments could do the job; the very paper you cited is a case in point.


Autonomous Tail-Sitter Flights in Unknown Environments

Lu, Guozheng, Ren, Yunfan, Zhu, Fangcheng, Li, Haotian, Xue, Ruize, Cai, Yixi, Lyu, Ximin, Zhang, Fu

arXiv.org Artificial Intelligence

Trajectory generation for fully autonomous flights of tail-sitter unmanned aerial vehicles (UAVs) presents substantial challenges due to their highly nonlinear aerodynamics. In this paper, we introduce, to the best of our knowledge, the world's first fully autonomous tail-sitter UAV capable of high-speed navigation in unknown, cluttered environments. The UAV autonomy is enabled by cutting-edge technologies including LiDAR-based sensing and differential-flatness-based trajectory planning and control, with purely onboard computation. In particular, we propose an optimization-based tail-sitter trajectory planning framework that generates high-speed, collision-free, and dynamically feasible trajectories. To efficiently and reliably solve this nonlinear, constrained problem, we develop an efficient feasibility-assured solver, EFOPT, tailored for the online planning of tail-sitter UAVs. We conduct extensive simulation studies to benchmark EFOPT against conventional NLP solvers on planning tasks. We also demonstrate extensive experiments of aggressive autonomous flights with speeds up to 15 m/s in various real-world environments, including indoor laboratories, underground parking lots, and outdoor parks. A video demonstration is available at https://youtu.be/OvqhlB2h3k8, and the EFOPT solver is open-sourced at https://github.com/hku-mars/EFOPT.
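
EFOPT itself is open-sourced at the repository above; purely as a generic illustration of the feasibility-assurance idea (a penalty-continuation loop, which is our assumption and not necessarily EFOPT's method), a solver wrapper could look like this:

```python
import numpy as np
from scipy.optimize import minimize

def solve_with_feasibility(objective, violations, x0,
                           mu0=1.0, growth=10.0, tol=1e-4, max_rounds=10):
    """Minimize objective(x) subject to violations(x) <= 0 elementwise, by
    re-solving with an increasing quadratic penalty until feasible."""
    x, mu = np.asarray(x0, dtype=float), mu0
    for _ in range(max_rounds):
        def penalized(z):
            v = np.maximum(violations(z), 0.0)
            return objective(z) + mu * np.sum(v ** 2)
        x = minimize(penalized, x, method="L-BFGS-B").x
        if np.all(violations(x) <= tol):                 # feasible: done
            return x
        mu *= growth                                     # tighten the penalty
    raise RuntimeError("failed to reach feasibility")

# Toy usage: minimize (x - 3)^2 subject to x <= 1.
x_star = solve_with_feasibility(lambda z: (z[0] - 3.0) ** 2,
                                lambda z: np.array([z[0] - 1.0]),
                                x0=[0.0])
print(x_star)   # approaches [1.0]
```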


Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control

Patel, Devdhar, Siegelmann, Hava

arXiv.org Artificial Intelligence

Reinforcement learning (RL) is rapidly reaching and surpassing human-level control capabilities. However, state-of-the-art RL algorithms often require timesteps and reaction times significantly faster than human capabilities, which is difficult to achieve in the real world and typically necessitates specialized hardware. We introduce Sequence Reinforcement Learning (SRL), an RL algorithm designed to produce a sequence of actions for a given input state, enabling effective control at lower decision frequencies. SRL addresses the challenges of learning action sequences by employing both a model and an actor-critic architecture operating at different temporal scales. We propose a "temporal recall" mechanism, in which the critic uses the model to estimate intermediate states between primitive actions, providing a learning signal for each individual action within the sequence. Once training is complete, the actor can generate action sequences independently of the model, achieving model-free control at a slower frequency. We evaluate SRL on a suite of continuous control tasks, demonstrating that it achieves performance comparable to state-of-the-art algorithms while significantly reducing actor sample complexity. To better assess performance across varying decision frequencies, we introduce the Frequency-Averaged Score (FAS) metric. Our results show that SRL significantly outperforms traditional RL algorithms in terms of FAS, making it particularly suitable for applications requiring variable decision frequencies. Additionally, we compare SRL with model-based online planning, showing that SRL achieves superior FAS while leveraging the same model during training that online planners use for planning.
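
A toy sketch of the "temporal recall" idea (the shapes, the linear actor and model, and all parameters below are hypothetical placeholders, not the paper's networks): the actor maps one observed state to a whole action sequence, and a learned one-step model rolls forward to recover the unseen intermediate states so a critic could score each action individually.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACT_DIM, SEQ_LEN = 4, 2, 3

# Linear stand-ins for the actor and the dynamics model (hypothetical).
W_actor = 0.1 * rng.normal(size=(SEQ_LEN * ACT_DIM, STATE_DIM))
W_model = 0.1 * rng.normal(size=(STATE_DIM, STATE_DIM + ACT_DIM))

def actor(state):
    """One state in, a whole sequence of actions out (model-free at test time)."""
    return (W_actor @ state).reshape(SEQ_LEN, ACT_DIM)

def model(state, action):
    """Learned one-step dynamics, used only during training."""
    return W_model @ np.concatenate([state, action])

def temporal_recall(state, actions):
    """Roll the model forward to estimate intermediate states, pairing a state
    with every action so the critic gets a per-action learning signal."""
    states = [state]
    for a in actions:
        states.append(model(states[-1], a))
    return states[:-1]

s0 = rng.normal(size=STATE_DIM)
print([np.round(s, 2) for s in temporal_recall(s0, actor(s0))])
```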